AITopics | generating training data

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

Neural Information Processing Systems | Dec-23-2025, 16:43:57 GMT

Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks: unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities, while bidirectional PLMs (e.g., BERT) have been the prominent choice for natural language understanding (NLU) tasks. While both types of models have achieved promising few-shot learning performance, their potential for zero-shot learning has been underexplored. In this paper, we present a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks without requiring any task-specific data: a unidirectional PLM generates class-conditioned texts guided by prompts, and these texts are used as the training data for fine-tuning a bidirectional PLM. With quality training data selected based on the generation probability, and with regularization techniques (label smoothing and temporal ensembling) applied to the fine-tuning stage for better generalization and stability, our approach demonstrates strong performance across seven classification tasks of the GLUE benchmark (e.g., 72.3/73.8 on MNLI-m/mm and 92.8 on SST-2), significantly outperforming zero-shot prompting methods and even achieving results comparable to strong few-shot approaches that use 32 training samples per class.

generating training data, language model, name change, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.77)
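
The abstract above names three ingredients: selecting generated texts by generation probability, label smoothing, and temporal ensembling. The following is a minimal sketch of each in plain Python, not the paper's implementation. The function names, the use of average token log-probability as the selection score, the `epsilon` and `momentum` defaults, and the toy data are all assumptions made for illustration; the ensembling step is a simplified moving average without bias correction.

```python
def select_by_generation_probability(candidates, k):
    """Keep the k texts with the highest average token log-probability.

    candidates: list of (text, token_logprobs) pairs, where token_logprobs
    are the generator's log-probabilities for each generated token.
    Averaging (an assumption here) avoids favoring short texts.
    """
    def avg_logprob(item):
        _, logprobs = item
        return sum(logprobs) / len(logprobs)

    ranked = sorted(candidates, key=avg_logprob, reverse=True)
    return [text for text, _ in ranked[:k]]


def smooth_labels(label, num_classes, epsilon=0.1):
    """Label smoothing: move epsilon of the probability mass from the
    target class to a uniform distribution over all classes."""
    target = [epsilon / num_classes] * num_classes
    target[label] += 1.0 - epsilon
    return target


def update_ensemble(ensemble, prediction, momentum=0.9):
    """Simplified temporal ensembling: exponential moving average of the
    classifier's predicted distribution for one training sample."""
    return [momentum * e + (1.0 - momentum) * p
            for e, p in zip(ensemble, prediction)]


# Hypothetical generated texts for one class, with made-up log-probabilities.
candidates = [
    ("A great movie.", [-0.2, -0.4, -0.3]),
    ("Movie the a great.", [-2.5, -3.0, -2.8, -3.1]),
    ("I loved it.", [-0.5, -0.6, -0.4]),
]
selected = select_by_generation_probability(candidates, k=2)
soft_target = smooth_labels(label=1, num_classes=2, epsilon=0.1)
ensembled = update_ensemble([0.5, 0.5], [1.0, 0.0], momentum=0.9)
```

In this toy run the low-probability, ungrammatical candidate is dropped, the hard label becomes a soft target, and the ensembled prediction moves only slightly toward the newest model output, which is what damps noisy labels during fine-tuning.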

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

Neural Information Processing Systems | Oct-9-2024, 09:23:41 GMT

generating training data, language model, zero-shot learning, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

danvk.org » Generating Training Data

#artificialintelligence | May-9-2016, 19:36:58 GMT

As you may remember from a previous post, I've been doing some work with a collection of old images. A key part of developing any heuristic algorithm like this one is to get some training data. You find the correct answer by hand for some fraction of the data, then judge your program by seeing how its results compare to the "golden" data. You could generate this sort of data by hand using a photo inspector and a text editor. But it would be tremendously tedious.

artificial intelligence, generating training data, machine learning, (7 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)
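
The blog excerpt above describes judging a program by comparing its results against hand-labeled "golden" data. A minimal sketch of that comparison follows; the function name, the dictionary layout, and the example file names and values are hypothetical and do not come from the post.

```python
def score_against_golden(golden, predicted):
    """Compare program output with hand-labeled answers.

    golden, predicted: dicts mapping an item key (e.g. an image file name)
    to its answer. Returns (num_correct, num_total, mismatched_keys).
    Items the program skipped count as mismatches.
    """
    mismatches = sorted(
        key for key, answer in golden.items()
        if predicted.get(key) != answer
    )
    total = len(golden)
    return total - len(mismatches), total, mismatches


# Hypothetical hand-labeled answers (number of photos found in each scan).
golden = {"scan_001.jpg": 3, "scan_002.jpg": 1, "scan_003.jpg": 2}
# Hypothetical program output; scan_003.jpg was not processed.
predicted = {"scan_001.jpg": 3, "scan_002.jpg": 2}

correct, total, mismatches = score_against_golden(golden, predicted)
print(f"{correct}/{total} correct; check: {mismatches}")
```

Listing the mismatched keys, rather than reporting only a score, makes it easier to inspect exactly where the heuristic disagrees with the hand-labeled data.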
